DETAILED ACTION 
Response to Amendment 

1. In response to the Office Action mailed July 25, 2006, applicant submitted an 
amendment filed on October 24, 2006, in which the applicant traversed and requested 
reconsideration. 

Response to Arguments 

2. Applicants argue that Yu et al. does not teach or suggest "presentation means for 
presenting to the user an unmatched portion between the recognized character string 
pattern and the recording character string pattern." Instead, Applicants contend, Yu et al. 
teaches that the mismatch is between the character written by the user and the entries in 
the resident dictionary. Applicants' arguments are persuasive, but are moot in view of new 
grounds of rejection. 

Applicants further argue that the combination of the document with Keiller et al. is 
impermissible. Applicants' arguments are persuasive, but are moot in view of new 
grounds of rejection. 

Applicants also argue that Ho et al. does not teach or suggest "display control 
means for controlling displaying of the recording character string indicating the sentence 
to be recorded." Applicants argue that Ho et al. displays text corresponding to speech 
that was already dictated by the user and recorded and converted by the machine. 
Applicants' arguments are persuasive, but are moot in view of new grounds of rejection. 

Applicants argue that Chihara does not teach or suggest "re-input instruction 
means for issuing an instruction to input speech once again when it is determined by 
said determination means that the matching does not exceed the predetermined level." 
Applicants further argue that Chihara relates to speech synthesis, not speech 
recognition, and that there is no motivation to combine. However, Applicants' arguments 
are moot in view of new grounds of rejection. 



Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1, 8, 15-18 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Keiller (USPN 6,560,575) in view of Jochumson (USPN 6,865,536) in view of Kirby 
et al. (USPN 6,226,615), hereinafter referenced as Kirby and in further view of Brown et 
al. (USPN 6,061,654), hereinafter referenced as Brown. 

Regarding claims 1, 8, and 15, Keiller discloses an apparatus, method and 
system for recording speech, to be used as learning data for recognizing input speech, 
comprising: 

storage means for storing a recording character string indicating a sentence to be 
recorded (column 16, lines 12-19); 

recognition means for recognizing input speech of the displayed sentence that a 
user reads out, and for obtaining a recognized character string (input is taken as two 
training examples: one a new example and one an already existing example; column 
15, lines 25-35) corresponding to the stored recording character string pattern (column 
16, lines 16-19); 

determination means for comparing a pattern of the recognized character string 
with a pattern of the recording character string stored in said storage means so as to 
obtain a matching rate therebetween, and determining whether said matching rate 
exceeds a predetermined level (system checks whether training examples are 
consistent (column 15, lines 28-30) by computing the consistency scores (column 15, 
lines 53-65) and comparing the result against a threshold (95%, column 16, 
lines 6-8)); and 

recording means for recording the input speech as the learning data for 
recognizing speech when it is determined by said determination means that said 
matching rate exceeds a predetermined level (if the results are consistent, they are 
used to generate a model for word being trained (column 15, lines 27-30), so inherently, 
the generated model is stored (recorded) to some memory means (see also column 16, 
lines 12-15)), but does not specifically teach display control means, re-input instruction 
means and presentation means. 
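
For illustration only, the determination and recording elements mapped above can be 
sketched as follows. This is a minimal sketch and is not Keiller's consistency-score 
computation: it assumes a word-level similarity ratio from Python's difflib module as the 
matching rate. The function names and the 0.95 default (echoing the 95% threshold cited 
above) are hypothetical. 

```python
from difflib import SequenceMatcher


def matching_rate(recognized: str, recording: str) -> float:
    """Return a word-level similarity ratio in [0.0, 1.0] (assumed metric)."""
    rec_words = recognized.lower().split()
    ref_words = recording.lower().split()
    return SequenceMatcher(None, rec_words, ref_words).ratio()


def should_record(recognized: str, recording: str, threshold: float = 0.95) -> bool:
    """Keep the utterance as learning data only if the rate exceeds the threshold."""
    return matching_rate(recognized, recording) > threshold
```

Under these assumptions, an utterance recognized exactly as the displayed sentence would 
be kept, while one that shares few words with it would not. 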

Jochumson discloses a speech correction device further comprising presentation 
means for presenting an unmatched portion between the recognized character string 
pattern (what user has actually verbalized) and the recording character string pattern 
(what is expected; column 2, lines 53-65), to provide results or feedback. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller's apparatus and method further 
comprising presentation means for presenting an unmatched portion between the 
recognized character string pattern and the recording character string pattern, as taught 
by Jochumson, to provide results and feedback to the user on how correct they were in 
stating the proper word or phrase (column 2, lines 53-65). 
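
As a hedged illustration of what presenting an unmatched portion could look like, the 
sketch below lists the word spans where the recognized character string departs from the 
recording character string. It is not Jochumson's method: the use of difflib opcodes and 
the name unmatched_portions are assumptions made for this example only. 

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def unmatched_portions(recognized: str, recording: str) -> List[Tuple[str, str]]:
    """Return (expected, recognized) word spans where the two strings differ."""
    ref = recording.split()
    hyp = recognized.split()
    diffs = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, ref, hyp).get_opcodes():
        if tag != "equal":
            # Pair the expected words with what was actually recognized.
            diffs.append((" ".join(ref[i1:i2]), " ".join(hyp[j1:j2])))
    return diffs
```

An empty list means that no unmatched portion exists; otherwise each pair could be shown 
to the user as feedback. 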

Keiller in view of Jochumson teaches storage means, determination means, 
recording means and presentation means, but does not specifically teach display 
control means and recognition means. 

Kirby discloses a speech recognition device comprising a display control means 
for controlling displaying of the recording character string indicating the sentence to be 
recorded (prompting system to identify the words to be spoken that are presented; 
column 2, lines 31-39 with column 3, lines 1-17), to determine a new match between 
text and speech. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson's apparatus and 
method wherein it comprises a display control means, as taught by Kirby, to determine 
a new match between text and speech, in order to try and regain synchronization 
(column 3, lines 51-65). 

Keiller in view of Jochumson and Kirby teaches a storage means, display control 
means, determination means, recording means and presentation means, but does not 
specifically teach a re-input instruction means. 

Brown teaches a speech synthesis apparatus comprising a re-input means for 
issuing an instruction to input speech once again when it is determined by said 
determination means that the matching rate does not exceed the predetermined level 
(indicates that no such match exists, re-prompt the user to speak again; column 3, lines 
28-52), to present the highest correct character string. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson and Kirby's 
apparatus and method wherein it comprises a re-input instruction means, as taught by 
Brown, to present the user with a positive match (column 3, lines 28-52). 
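
The re-input instruction element can likewise be sketched as a simple loop. This is a 
minimal sketch under stated assumptions and does not reproduce any cited reference: the 
recognize and rate callables, the attempt limit, and the function name are hypothetical, 
with recognize standing in for display of the sentence plus speech recognition. 

```python
from typing import Callable, Optional


def record_with_reprompt(
    sentence: str,
    recognize: Callable[[str], str],
    rate: Callable[[str, str], float],
    threshold: float = 0.95,
    max_attempts: int = 3,
) -> Optional[str]:
    """Prompt until the matching rate exceeds the threshold or attempts run out."""
    for _ in range(max_attempts):
        recognized = recognize(sentence)
        if rate(recognized, sentence) > threshold:
            return recognized  # accepted: would be recorded as learning data
        # Otherwise fall through and instruct the user to speak again.
    return None
```

Returning None after the last attempt is a design choice for the sketch; the claims 
recite only that an instruction to input speech once again is issued. 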

Regarding claim 16, Keiller discloses a speech recognition method comprising: 

a learning recognition step of recognizing input speech, of the displayed 
sentence that a user reads out, and for obtaining a recognized character string (input is 
taken as two training examples: one a new example and one an already existing 
example; column 15, lines 25-35); 

a determination step of comparing a pattern of the recognized character string 
with a pattern of a recording character string indicating a sentence to be recorded so as 
to obtain a matching rate therebetween, and of determining whether said matching rate 
exceeds a predetermined level (system checks whether training examples are 
consistent (column 15, lines 28-30) by computing consistency scores (column 15, lines 
53-65) and comparing the result against a threshold (95%, column 16, lines 6-8)); 

a recording step of recording the input speech as the learning data for 
recognizing speech when it is determined in said determination step that said matching 
rate exceeds a predetermined level (if results are consistent, they are used to generate 
a model for word being trained (column 15, lines 27-30), so inherently, the generated 
model is stored (recorded) to a memory means (column 16, lines 12-19)); 

a learning step of performing learning on a speech model by using the input 
speech recorded in said recording step (the process described above provides general 
training of the model; column 16, lines 14-20); and 

a recognition step of recognizing unknown input speech by using the speech 
model learned in said learning step (training data is used in general recognition; column 
16, lines 14-20), but does not specifically teach display control means, re-input 
instruction means and presentation means. 

Jochumson discloses a speech correction device further comprising presentation 
means for presenting an unmatched portion between the recognized character string 
pattern (what user has actually verbalized) and the recording character string pattern 
(what is expected; column 2, lines 53-65), to provide results or feedback. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller's apparatus and method further 
comprising presentation means for presenting an unmatched portion between the 
recognized character string pattern and the recording character string pattern, as taught 
by Jochumson, to provide results and feedback to the user on how correct they were in 
stating the proper word or phrase (column 2, lines 53-65). 

Keiller in view of Jochumson teaches storage means, determination means, 
recording means and presentation means, but does not specifically teach display 
control means and recognition means. 

Kirby discloses a speech recognition device comprising a display control means 
for controlling displaying of the recording character string indicating the sentence to be 
recorded (prompting system to identify the words to be spoken that are presented; 
column 2, lines 31-39 with column 3, lines 1-17), to determine a new match between 
text and speech. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson's apparatus and 
method wherein it comprises a display control means, as taught by Kirby, to determine 
a new match between text and speech, in order to try and regain synchronization 
(column 3, lines 51-65). 

Keiller in view of Jochumson and Kirby teaches a storage means, display control 
means, determination means, recording means and presentation means, but does not 
specifically teach a re-input instruction means. 

Brown teaches a speech synthesis apparatus comprising a re-input means for 
issuing an instruction to input speech once again when it is determined by said 
determination means that the matching rate does not exceed the predetermined level 
(indicates that no such match exists, re-prompt the user to speak again; column 3, lines 
28-52), to present the highest correct character string. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson and Kirby's 
apparatus and method wherein it comprises a re-input instruction means, as taught by 
Brown, to present the user with a positive match (column 3, lines 28-52). 

Regarding claims 17 and 18, Keiller discloses a control program having 
computer readable program code and a speech recognition method, comprising: 

a second program code unit for recognizing input speech of the displayed 
sentence that a user reads out, and for obtaining a recognized character string (input is 
taken as two training examples: one a new example and one an already existing 
example; column 15, lines 25-35); 

a third program code unit for comparing a pattern of the recognized character 
string with a pattern of the recording character string so as to obtain a matching rate 
therebetween, and for determining whether said matching rate exceeds a 
predetermined level (system checks whether training examples are consistent (column 
15, lines 28-30) by computing consistency scores (column 15, lines 53-65) and 
comparing the result against a threshold (95%, column 16, lines 6-8)); 

a fourth program code unit for recording the input speech as the learning data for 
recognizing speech when it is determined by said determination step that said matching 
rate exceeds a predetermined level (if results are consistent, they are used to generate 
a model for word being trained (column 15, lines 27-30), so inherently, the generated 
model is stored (recorded) to a memory means (column 16, lines 12-19)); 

a fourth program code unit for performing learning on a speech model by using 
the input speech recorded in said record step (the process described above provides 
general training of the model; column 16, lines 14-20); and 

an eighth program code unit for recognizing unknown input speech by using the 
speech model learned in said learning step (training data is used in general recognition; 
column 16, lines 14-20), but does not specifically teach display control means, re-input 
instruction means and presentation means. 

Jochumson discloses a speech correction device further comprising presentation 
means for presenting an unmatched portion between the recognized character string 
pattern (what user has actually verbalized) and the recording character string pattern 
(what is expected; column 2, lines 53-65), to provide results or feedback. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller's apparatus and method further 
comprising presentation means for presenting an unmatched portion between the 
recognized character string pattern and the recording character string pattern, as taught 
by Jochumson, to provide results and feedback to the user on how correct they were in 
stating the proper word or phrase (column 2, lines 53-65). 

Keiller in view of Jochumson teaches storage means, determination means, 
recording means and presentation means, but does not specifically teach display 
control means and recognition means. 

Kirby discloses a speech recognition device comprising a display control means 
for controlling displaying of the recording character string indicating the sentence to be 
recorded (prompting system to identify the words to be spoken that are presented; 
column 2, lines 31-39 with column 3, lines 1-17), to determine a new match between 
text and speech. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson's apparatus and 
method wherein it comprises a display control means, as taught by Kirby, to determine 
a new match between text and speech, in order to try and regain synchronization 
(column 3, lines 51-65). 

Keiller in view of Jochumson and Kirby teaches a storage means, display control 
means, determination means, recording means and presentation means, but does not 
specifically teach a re-input instruction means. 

Brown teaches a speech synthesis apparatus comprising a re-input means for 
issuing an instruction to input speech once again when it is determined by said 
determination means that the matching rate does not exceed the predetermined level 
(indicates that no such match exists, re-prompt the user to speak again; column 3, lines 
28-52), to present the highest correct character string. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson and Kirby's 
apparatus and method wherein it comprises a re-input instruction means, as taught by 
Brown, to present the user with a positive match (column 3, lines 28-52). 

5. Claims 5 and 12 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Keiller in view of Jochumson, Kirby and Brown and in further view of Crepy et al. 
(USPN 6,622,121), hereinafter referenced as Crepy. 

Regarding claims 5 and 12, Keiller in view of Jochumson, Kirby and Brown 
discloses an apparatus and method for recording speech, to be used as learning data in 
speech recognition processing, but lacks wherein said presentation means presents the 
unmatched portion so as to identify the type of error as an insertion error, a deletion 
error, or a substitution error, as determined by said determination means. 

Crepy discloses a speech correction device wherein said presentation means 
presents the unmatched portion so as to identify the type of error as an insertion error 
(insertions), a deletion error (deletions), or a substitution error (substitutions), as 
determined by said determination means (column 4, line 65 - column 5, line 11), to 
generate an error report. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in combination with Jochumson, Kirby 
and Brown's apparatus and method wherein said presentation means presents the 
unmatched portion so as to identify the type of error as an insertion error, a deletion 
error, or a substitution error, as taught by Crepy, to generate an error report from which 
various measurements may be derived (column 4, line 65 - column 5, line 11). 
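
For illustration only, identifying the type of error as an insertion, a deletion, or a 
substitution can be sketched with a standard word-level edit-distance alignment. This is 
not Crepy's procedure: the dynamic-programming alignment and the name classify_errors 
are assumptions made for this example. 

```python
from typing import List, Tuple


def classify_errors(recording: str, recognized: str) -> List[Tuple[str, str, str]]:
    """Label each mismatch as (error_type, expected_word, recognized_word)."""
    ref, hyp = recording.split(), recognized.split()
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the error labels.
    errors = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            if ref[i - 1] != hyp[j - 1]:
                errors.append(("substitution", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("deletion", ref[i - 1], ""))
            i -= 1
        else:
            errors.append(("insertion", "", hyp[j - 1]))
            j -= 1
    errors.reverse()
    return errors
```

The returned list is in reading order, so it could feed an error report of the kind 
described above; when several alignments have equal cost, the sketch picks one of them. 
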

6. Claims 6-7 and 13-14 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Keiller in view of Jochumson, Kirby and Brown and in further view of 
Baker (USPN 6,122,613). 

Regarding claims 6 and 13, Keiller in view of Jochumson, Kirby and Brown 
discloses an apparatus and method for recording speech, to be used as learning data in 
speech recognition processing, but lacks wherein said presentation means 
simultaneously displays the recognized character string and the recording character 
string on a screen by changing a character attribute or a background attribute of an 
unmatched portion or a matched portion of at least one of the recognized character 
string and the recording character string. 

Baker discloses a speech correction device wherein said 
presentation means simultaneously displays the recognized character string and the 
recording character string on a screen by changing a character attribute or a 
background attribute of an unmatched portion or a matched portion of at least one of the 
recognized character string and the recording character string (highlight uncertainty 
using reverse contrast; column 7, lines 1-16 and column 11, lines 23-30), to identify the 
error. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson, Kirby and Brown's 
apparatus and method wherein said presentation means simultaneously displays the 
recognized character string and the recording character string on a screen by changing 
a character attribute or a background attribute of an unmatched portion or a matched 
portion of at least one of the recognized character string and the recording character 
string, as taught by Baker, to provide the speaker with essentially visual feedback for 
quick and easy review of text and to perform revisions (column 4, line 66 - column 5, 
line 6). 
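
A hedged sketch of displaying both strings with a changed attribute on the unmatched 
portion follows. It is not Baker's implementation: it assumes a terminal that honors the 
ANSI reverse-video escape sequence, and the name highlight_unmatched is hypothetical. 

```python
from difflib import SequenceMatcher
from typing import Tuple

REVERSE = "\x1b[7m"  # ANSI reverse-video attribute (assumes terminal support)
RESET = "\x1b[0m"    # restores the default attributes


def highlight_unmatched(recording: str, recognized: str) -> Tuple[str, str]:
    """Return both strings with unmatched words wrapped in reverse video."""
    ref, hyp = recording.split(), recognized.split()
    ref_out, hyp_out = [], []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, ref, hyp).get_opcodes():
        mark = tag != "equal"
        ref_out += [REVERSE + w + RESET if mark else w for w in ref[i1:i2]]
        hyp_out += [REVERSE + w + RESET if mark else w for w in hyp[j1:j2]]
    return " ".join(ref_out), " ".join(hyp_out)
```

Printing the two returned strings on consecutive lines would show the recording 
character string and the recognized character string together, with only the differing 
words in reverse contrast. 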

Regarding claims 7 and 14, Keiller in view of Jochumson, Kirby and Brown 
discloses an apparatus and method for recording speech, to be used as learning data in 
speech recognition processing, but lacks wherein said presentation means 
simultaneously displays the recognized character string and the recording character 
string on a screen by causing unmatched portion or matched portion of at least one 
recognized character string and the recording character string to blink. 

Baker discloses a speech correction device, but does not specifically teach a 
device wherein said presentation means simultaneously displays the recognized 
character string and the recording character string on a screen by causing unmatched 
portion or matched portion of at least one recognized character string and the recording 
character string to blink. 

However, it would have been obvious to one of ordinary skill in the art at the time 
the invention was made that providing visual feedback of the uncertainties by 
highlighting each instance of uncertainty (e.g. bold or reverse contrast; column 11, lines 
22-30 with column 7, lines 1-9) would include flashing, to make the error, mistake or 
uncertainty stand out. 



Application/Control Number: 09/976,098 Page 15 

Art Unit: 2626 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Keiller in view of Jochumson, Kirby and Brown's 
apparatus and method wherein said presentation means simultaneously displays the 
recognized character string and the recording character string on a screen by causing 
unmatched portion or matched portion of at least one recognized character string and 
the recording character string to blink, as taught by Baker, to provide the speaker with 
essentially visual feedback for quick and easy review of text and to perform revisions 
(column 4, line 66 - column 5, line 6). 

Conclusion 

7. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

• Iso-Sipila et al. (USPN 6,697,782) disclose a method in the recognition of 
speech and a wireless communication device to be controlled by speech. 

• Waibel et al. (USPN 5,855,000) disclose a method and apparatus for correcting 
and repairing errors. 

8. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jakieda R. Jackson whose telephone number is 
571.272.7619. The examiner can normally be reached on Monday through Friday from 
7:30 a.m. to 5:00 p.m. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on 571.272.7843. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



JRJ 

December 28, 2006 




DAVID HUDSPETH 
SUPERVISORY PATENT EXAMINER 
TECHNOLOGY CENTER 2600 



